feat: add helm/michelangelo Helm chart for control plane deployment #1143

Open
sallycr wants to merge 1 commit into main from helm-chart

sallycr commented May 5, 2026

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

What changed?

Adds helm/michelangelo/ — a first-class Helm chart that installs the full Michelangelo control plane (apiserver, envoy, ui, worker, controllermgr) into any Kubernetes cluster with a single helm install command. All 5 services are promoted from bare Pods to Deployments. Includes least-privilege RBAC, schema init containers, per-service enabled toggles, Cadence/Temporal engine guards, credential Secrets with resource-policy: keep, and a helm test hook.
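
For reviewers less familiar with the init-container pattern, here is a minimal sketch of how the apiserver Deployment could run schema setup before the main container starts. The container names match the commit summary (wait-for-metadata-storage and schema-init); the images, the `metadataStorage.host`/`metadataStorage.port` and `apiserver.image` values keys, and the commands are assumptions for illustration, not the chart's actual contents.

```yaml
# Hypothetical excerpt from an apiserver Deployment template.
# Images, values keys, and commands are illustrative assumptions.
spec:
  template:
    spec:
      initContainers:
        # Init containers run one at a time, in order; each must exit 0
        # before the next one starts.
        - name: wait-for-metadata-storage
          image: busybox:1.36
          command:
            - sh
            - -c
            - |
              until nc -z {{ .Values.metadataStorage.host }} {{ .Values.metadataStorage.port }}; do
                echo "waiting for metadata storage"; sleep 2
              done
        - name: schema-init
          image: {{ .Values.apiserver.image | quote }}  # assumed values key
          command: ["/bin/sh", "-c", "echo 'apply schema here'"]  # placeholder command
      containers:
        - name: apiserver
          image: {{ .Values.apiserver.image | quote }}
```

Kubernetes does not start the main containers until every init container has succeeded, so in a layout like this the apiserver would never start against an unreachable or uninitialized store.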

Why?

The control plane was previously deployed by sandbox.py via sequential kubectl apply calls with hardcoded addresses and no self-healing. A Helm chart enables standard install/upgrade/uninstall lifecycle, works against any cluster (local k3d, staging, production), and is required for open-source users who don't use the ma CLI. Closes #1136.

How did you test it?

  • helm lint helm/michelangelo → 0 errors, 0 failures
  • helm template with no values → fails fast with "workflow.endpoint is required"
  • helm template -f values-k3d.yaml → 21 resources render clean
  • helm template with full production values (Temporal engine) → 21 resources render clean
  • helm template --set workflow.engine=invalid → clear validation error
  • helm install -f values-k3d.yaml against live k3d cluster → all 5 pods Running within 60s
  • helm test michelangelo → Phase: Succeeded
  • helm upgrade --reuse-values → zero pod restarts
  • helm uninstall → credential Secrets survive (resource-policy: keep confirmed)
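
The two fail-fast checks in the list above are the kind of guard usually written with Helm's built-in `required` and `fail` template functions. The following is a hedged sketch only: the helper name `michelangelo.validateWorkflow`, the lowercase engine names, and the wording of the second error message are assumptions, and it presumes `values.yaml` defines a `workflow` map.

```yaml
{{/* _helpers.tpl (hypothetical): validate workflow settings at render time. */}}
{{- define "michelangelo.validateWorkflow" -}}
{{- /* Aborts rendering when the endpoint is empty or unset. */ -}}
{{- $_ := required "workflow.endpoint is required" .Values.workflow.endpoint -}}
{{- /* Assumed accepted values; the chart may spell these differently. */ -}}
{{- $engines := list "cadence" "temporal" -}}
{{- if not (has .Values.workflow.engine $engines) -}}
{{- fail (printf "workflow.engine must be one of %s, got %q" (join ", " $engines) .Values.workflow.engine) -}}
{{- end -}}
{{- end -}}
```

A template would call it with `{{ include "michelangelo.validateWorkflow" . }}`, so that a command such as `helm template --set workflow.engine=invalid` stops with an error instead of emitting manifests.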

Potential risks

None — this PR only adds new files under helm/. No existing code is modified; sandbox.py continues to deploy the control plane via kubectl apply unchanged until Phase 4 integration.

Release notes

N/A — additive change, no migration required.

Documentation Changes

helm/michelangelo/README.md documents prerequisites, install commands (k3d and production), full values reference, upgrade/uninstall instructions, and troubleshooting. No wiki changes needed.

Summary:
Intent:
- Convert the Michelangelo control plane from raw kubectl apply calls in
  sandbox.py to a first-class Helm chart installable against any Kubernetes
  cluster (closes #1136)
- Enable helm install for local k3d development and standard --set overrides
  for production and staging environments

Changes:
- Add helm/michelangelo/ chart with Chart.yaml, values.yaml, values-k3d.yaml,
  and 20 templates covering all 5 control plane services (apiserver, envoy, ui,
  worker, controllermgr) promoted to Deployments
- Add schema init containers on apiserver (wait-for-metadata-storage +
  schema-init) eliminating the ordering race condition in sandbox.py
- Replace boot.yaml cluster-admin with a least-privilege ClusterRole scoped
  to what controllermgr and apiserver crdSync actually need
- Rename minio-credentials to object-storage-credentials in chart templates
- Add per-service enabled toggles, Cadence/Temporal engine guards with
  fail-fast validation, and helm test hook
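
As an illustration of the renamed credential Secret combined with the keep policy, a template along these lines would survive `helm uninstall`. The `helm.sh/resource-policy` annotation is standard Helm; the data keys and the `objectStorage.*` values paths are assumptions, not the chart's actual contents.

```yaml
# Hypothetical Secret template; data keys and values paths are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: object-storage-credentials
  annotations:
    # Helm skips deleting resources that carry this annotation.
    "helm.sh/resource-policy": keep
type: Opaque
stringData:
  accessKey: {{ .Values.objectStorage.accessKey | quote }}
  secretKey: {{ .Values.objectStorage.secretKey | quote }}
```

Resources kept this way are no longer managed by Helm after uninstall, so they have to be removed manually (for example with kubectl delete) when they are no longer needed.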

Test Plan:
- helm lint → 0 errors, 0 failures
- helm template with no values → fails fast with clear required-value error
- helm template -f values-k3d.yaml → 21 resources render clean
- helm template with full production values (Temporal) → 21 resources render clean
- helm install -f values-k3d.yaml against live k3d → all 5 pods Running
- helm test michelangelo → Phase: Succeeded
- helm upgrade --reuse-values → zero pod restarts
- helm uninstall → credential Secrets survive (resource-policy: keep)

Revert Plan:
- Revert this PR via git revert. The helm/ directory is additive and sandbox.py
  is unchanged — no production behavior is affected.

Closes #1136

Development

Successfully merging this pull request may close these issues.

feat: publish Michelangelo control plane as a Helm chart for open source deployment
